Development of Sindhi text corpus
Similar resources
Sentiment Summerization and Analysis of Sindhi Text
A text corpus is important for assessing language features and analyzing variation. Machine learning techniques identify language terms, features, text structures and sentiment from a linguistic corpus. Sindhi is one of the oldest languages in the world, with a proper script and a complete grammar. Computationally, Sindhi has remained an under-resourced language even in this digital era. Vie...
Corpus based coreference resolution for Farsi text
"Coreference resolution", or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreferent when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...
Development of Unicode based Sindhi Typing System
This paper presents a first attempt at designing and developing a Unicode-based Sindhi Typing System for the Sindhi-speaking community. The Sindhi Typing project was developed to improve the typing speed of Sindhi computing professionals, as no such system currently exists. It is a platform-independent application requiring no third-party plugin or regional language support. No Sind...
Design & Development of the Graphical User Interface for Sindhi Language
This paper describes the design and implementation of a Unicode-based GUISL (Graphical User Interface for Sindhi Language). The idea is to provide a software platform to the people of Sindh as well as Sindhi diasporas living across the globe to make use of computing for basic tasks such as editing, composition, formatting, and printing of documents in Sindhi by using GUISL. The implementation o...
Visualization of Text Document Corpus
From the automated text processing point of view, natural language is very redundant in the sense that many different words share a common or similar meaning. For a computer, this can be hard to understand without some background knowledge. Latent Semantic Indexing (LSI) is a technique that helps in extracting some of this background knowledge from a corpus of text documents. This can also be viewed...
Journal
Journal title: Journal of King Saud University - Computer and Information Sciences
Year: 2021
ISSN: 1319-1578
DOI: 10.1016/j.jksuci.2019.02.002